Speculative Clustered Caches for Clustered Processors

Authors

Dana S. Henry

Gabriel H. Loh

Rahul Sami

Abstract

Clustering is a technique for partitioning a superscalar processor's execution resources to simultaneously allow for more in-flight instructions, a wider issue width, and more aggressive clock speeds. As either the size of individual clusters or the total number of clusters increases, the distance to the first-level data cache increases as well. Although clustering may expose more parallelism by allowing a greater number of instructions to be analyzed and issued simultaneously, the gains may be obliterated if the latencies to memory grow too large. We propose to augment each cluster with a small, fast, simple Level Zero (L0) data cache that is accessed in parallel with a traditional L1 data cache. Our solution differs from other proposed caching techniques for clustered processors in that we do not support versioning or coherence. As a result, a load instruction may occasionally read a stale value from the L0 cache, but the common case is a low-latency hit in the L0 cache. Our simulation studies show that 4KB, 2-way set-associative L0 caches provide a 6.5% to 12.3% IPC improvement over a wide range of processor configurations.
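The trade-off described in the abstract can be illustrated with a toy model. The following is only a sketch under stated assumptions, not the authors' simulator: the class names, the latency constants, the unbounded dictionary standing in for a 4KB 2-way cache, and the idea of flagging a stale value by comparing it with the L1 copy are all hypothetical choices made for illustration. It shows why a per-cluster L0 without coherence usually returns quickly, and how a store from another cluster can leave a stale copy behind.

```python
from dataclasses import dataclass, field

# Hypothetical cycle counts chosen for illustration; not taken from the paper.
L0_LATENCY = 1
L1_LATENCY = 3


@dataclass
class Cluster:
    # Private per-cluster L0: address -> value. Capacity and associativity
    # are ignored here (assumption), so nothing is ever evicted.
    l0: dict = field(default_factory=dict)


class ClusteredMemory:
    """Toy model: one shared L1 plus one non-coherent L0 per cluster."""

    def __init__(self, num_clusters):
        self.l1 = {}
        self.clusters = [Cluster() for _ in range(num_clusters)]

    def store(self, cluster_id, addr, value):
        # A store updates the shared L1 and only the issuing cluster's L0.
        # Other clusters' L0 copies are left untouched (no coherence).
        self.l1[addr] = value
        self.clusters[cluster_id].l0[addr] = value

    def load(self, cluster_id, addr):
        """Return (value, latency, stale) for a load issued by cluster_id."""
        l0 = self.clusters[cluster_id].l0
        # The L1 is probed in parallel, so its value is known a bit later.
        correct = self.l1.get(addr, 0)
        if addr in l0:
            speculative = l0[addr]
            # Assumption: staleness is detected by comparing with the L1 copy.
            return speculative, L0_LATENCY, speculative != correct
        l0[addr] = correct  # fill the L0 on a miss
        return correct, L1_LATENCY, False
```

In this sketch, a load that hits in its own cluster's L0 completes at the short latency, and the `stale` flag is raised only when another cluster has written the same address in the meantime. How a real design recovers from such a stale read is not specified in the abstract and is not modeled here.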

Similar resources

Thesis - Vasileios Porpodas

Very Long Instruction Word (VLIW) processors are wide-issue statically scheduled processors. Instruction scheduling for these processors is performed by the compiler and is therefore a critical factor for its operation. Some VLIWs are clustered, a design that improves scalability to higher issue widths while improving energy efficiency and frequency. Their design is based on physically partitio...

The Increment Predictor for Speculative Multithreaded

The speculative multithreading paradigm (speculative thread-level parallelism) is based on the concurrent execution of control-speculative threads. The efficiency of microarchitectures that adopt this paradigm strongly depends on the performance of the control and data speculation techniques. While control speculation is used to predict the most effective points where a thread can be spawned, da...

Parallel Pull-Based LRU: A Request Distribution Algorithm for Clustered Web Caches Using a DSM for Memory Mapped Networks

The SIRAC laboratory has developed SciFS, a Distributed Shared Memory (DSM) that tries to benefit from the high performances and the remote addressing capabilities of the Scalable Coherent Interface (SCI) memory mapped network. We use SciFS for high performance cluster computing but we now experiment with it to build large scale clustered web caches. We propose Whoops! a clustered web cache pro...

Thread-Spawning Schemes for Speculative Multithreading

Speculative multithreading has been recently proposed to boost performance by means of exploiting thread-level parallelism in applications difficult to parallelize. The performance of these processors heavily depends on the partitioning policy used to split the program into threads. Previous work uses heuristics to spawn speculative threads based on easily-detectable program constructs such as ...

Cluster Level Multithreading for VLIW Processors

Clustered VLIW embedded processors have become widespread due to the benefits of simple hardware and low power. However, the ILP in most of the applications today is limited and discourages the design of wider issue processors. Simultaneous MultiThreading (SMT) is a well-known technique to improve resource utilization by exploiting thread-level ILP. However, implementing SMT is not feasible for e...

Publication date: 2002
